Abstract

We present a study on the near equilibrium dynamics of two small proteins in the family 
of truncated hemoglobins, developed under the framework of a Gaussian network approach. 
Effective beta carbon atoms are taken into account besides Cα atoms for all residues but glycines in the
coarse-graining procedure, without leading to an increase in the degrees of freedom (β Gaussian
Model). The normalized covariance matrix and the deformation along the slowest modes with collective
character are analyzed, pointing out anti-correlations between functionally relevant sites for 
the proteins under study. In particular we underline the functional motions of an extended 
tunnel-cavity system running inside the protein matrix, which provides a pathway for small ligands
binding to the iron in the heme group. We give a rough estimate of the order of magnitude
of the relaxation times of the slowest two overdamped modes and compare results with previous 
studies on globins. 
I. INTRODUCTION



Several studies in the past decades have shown the validity of the normal modes approach 
to extract useful information on the large-scale functional movements of proteins near their 
native state conformation.

Molecular dynamics simulations of biomolecules performed using detailed all-atom potentials
yield a large amount of information regarding large amplitude, concerted displacements of
atoms. However, these can be simply obtained within the harmonic approximation and
from the analysis of the Hessian matrix: in fact only low-frequency modes provide the major
part of the norm for those global motions, whereas the fastest modes account for only
spatially localized fluctuations [7].

It has become customary to project the dynamical trajectories of the atoms in the 
molecules onto normal mode axes; thus one is brought to interpret the functional, large
amplitude motions of biological relevance for proteins as superpositions of independent har- 
monic modes of oscillations of a network of atoms. 

A pioneering work developed by Tirion paved the way for extremely simplified Normal
Mode Analysis (NMA): detailed harmonic potentials are replaced by a single-parameter, 
spring-like potential between atoms found to be in contact in the native configuration. 

Despite the extreme simplicity of this approach, the good agreement obtained with atomic
mean square displacements of molecular dynamics simulations [10] opened the possibility
for further studies within the same approach [12]: a good level
of consistency with more accurate analyses is achieved even treating proteins under coarse-
grained schemes, as recently shown by Bahar and co-workers, who
developed simple yet useful models to explore the collective motions of proteins, profitably
adopted by other groups.

In the present study two recently solved structures are addressed, which belong
to the family of truncated hemoglobins (trHbs), small heme proteins widely distributed
in bacteria, protozoa and plants, forming a distinct group within the hemoglobin
superfamily [22].



Though having a simpler structure than the traditional globin fold, they still preserve
the respiratory function, providing transport and storage of oxygen molecules. Furthermore
they have been proposed to be involved also in other biological functions, such as protection
against reactive nitrogen species, photosynthesis, or action as terminal oxidases [22, 25, 26, 27].



The low complexity of trHbs structure, compared to normal globin folds, might help the 
comprehension of the mechanisms used by these shorter molecules to bind small ligands to 
the heme iron atom (e.g.: O2, their main target, and CO, to which heme has a high affinity). 

In particular, the presence of an apolar cavity system extending throughout the protein
matrix of truncated hemoglobin from Mycobacterium tuberculosis and homologous structures
has been recently noticed: this tunnel connects the heme distal pocket to the
protein surface, and may thus allow an efficient diffusion path for oxygen and other small
molecules to the iron atom (fig. 1).

The role of protein cavities has been deeply investigated in myoglobin, both theoretically,
using computer simulations, and experimentally, suggesting pathways for ligand migration
switched by a small number of substates, which can be allosterically converted to the
stable conformations.

These issues are investigated here from a novel point of view, through a simple coarse- 
grained scheme in the spirit of the Gaussian chain models, with a twofold goal: understanding 
the mechanical processes involved in the functional movements of these key proteins and 
taking advantage of this new Gaussian framework, computationally fast and conceptually 
simple. 



II. STRUCTURAL CHARACTERIZATION 

The structures addressed in the present study are two truncated hemoglobins from the
ciliated protozoan Paramecium caudatum (PtrHb, pdb id: 1dlw) and the green unicellular
alga Chlamydomonas eugametos (CtrHb, pdb id: 1dly), solved at 1.54 and 1.80 Å resolution
respectively, by Pesce et al.

Similarly to the other proteins belonging to the trHb family, they display low sequence
identity with hemoglobins from vertebrates and non-vertebrates. This is smaller than 15% for
PtrHb and CtrHb, due to substantial residue deletions at either the N- or C-termini and in the
C and D helical region of the globin fold [21].



More than 70% of the residues in the two structures belong to helices, mainly of type
α (above 67% in both proteins: only the short helix C is of type $3_{10}$): this is a typical
feature of the globin fold, which suggests a primary role of helices in the functional
motions of these proteins, as in myoglobin and hemoglobin. Nonetheless several
structural differences make truncated hemoglobins fall in a distinct group in the hemoglobin
superfamily [21].

FIG. 1: Truncated hemoglobin fold from Paramecium caudatum: helices, coils and the main
residues described in the text are labeled according to the standard nomenclature for globins. The
two-over-two helical structure enclosing the heme group is clearly visible: (a) side view, (b) top
view. Figure drawn using Molscript [32] and Raster3d [33].

Helices in the globin fold are traditionally indexed through capital letters A, B, C, D, 
E, F, G and H, while loops between them are named according to the nearby helices, and 
residues are numbered sequentially within each unit.

The structures taken in consideration here reveal the so called "two over two α helical
sandwich" (fig. 1), in place of the classical "three over three" observed in the globin fold:
in fact helix D is absent, while N-terminal A helix and proximal F helix are drastically 
reduced to only one turn. 

A structure-based sequence alignment of PtrHb, CtrHb and other truncated hemoglobins
with sperm whale myoglobin, reported by Pesce et al., shows the strongly conserved residues among
proteins in the trHbs family, mainly of three types:

glycine rich motifs, especially at helix termini, which enhance structural flexibility (Gly-Gly
motifs at the beginning of the AB and EF regions, and Gly-Arg/Lys in the pre-F
region [21]);

hydrophobic residues on the heme distal and proximal sides, which play the main role of
shielding heme from solvent molecules, in order to prevent iron oxidation;

heme binding residues stabilizing the porphyrin ring in the heme pocket; one particularly
relevant is the proximal histidine, His 68, localized on helix F.

Strongly conserved residues on the distal side responsible for the shielding of the heme 
pocket from the solvent are mainly localized on helices B and E, as well as in the CD and 
EF loops: hydrophobic residues Phe A12, B9, CD1, E14 and Trp EF7, with their side chains 
pointing to the inner part of the molecule; Tyr B10 and Gln E7, with side chains responsible
for the stabilization of the ligand bound to heme [24].



The hydrophobic residues identified in [21, 28] as the ones defining a cavity inside the
molecule, linking the solvent exposed surface of the proteins to the heme group, are positioned
on the distal side. They are mainly localized on helices A (at the opening of the tunnel on 
the surface), B, E (limiting the distal side) and G. 

On the proximal side of the heme pocket one finds the proximal histidine, in a strongly
conserved position within the hemoglobin (Hb) and trHb families: the imidazole ring of histidine
allows it to act as either a proton donor or acceptor at physiological pH. In hemoglobins
its ability to buffer the H+ ions from carbonic acid ionization in red blood
cells is essential, allowing the molecule to exchange O2 and CO2 respectively at the tissues and at the
lungs.

It will be shown how the small α helix F, which contains the proximal histidine F8,
can play a leading role as a reference position for elucidating the functional motions of the 
protein regions around the heme pocket. 



III. THEORY 

The model adopted in this study is the Beta Gaussian Model (βGM), presented in a previous study:
a single parameter model apt to describe small amplitude fluctuations of residues around
their native-state equilibrium. The model is based on the Gaussian Network Model (GNM)
and on the Anisotropic Network Model (ANM), which have been successfully exploited in
previous studies on the functional motions of proteins under the
framework of the single parameter model introduced by Tirion.

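Only alpha and beta carbon atoms (Cα, Cβ) are treated: rather than the actual Cβ, the
latter is an effective centroid accounting for the directionality of the side chain, built for
all residues but glycines and terminal ones; its position is determined by the coordinates of
neighbouring α carbons, according to the following relation:

$$\mathbf{r}_i^{\beta} = \mathbf{r}_i^{\alpha} + 3\,\frac{2\mathbf{r}_i^{\alpha} - \mathbf{r}_{i-1}^{\alpha} - \mathbf{r}_{i+1}^{\alpha}}{\left|2\mathbf{r}_i^{\alpha} - \mathbf{r}_{i-1}^{\alpha} - \mathbf{r}_{i+1}^{\alpha}\right|} \qquad (1)$$

where the vectors $\mathbf{r}_i^{\alpha}$ and $\mathbf{r}_i^{\beta}$ hold the native coordinates (in Angstroms) of the α carbon
atom and of the effective β centroid which belong to the ith residue. Expanding the displacement
of the Cβ from the equilibrium to leading order in the displacements of the Cα atoms,
one gets a linear relation between the displacement of each Cβ and those of the neighbouring Cα atoms (equation (2)).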
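A sketch of this expansion can be written under one assumption: that equation (1) places the centroid at a fixed distance of 3 Å from the Cα, along the unit vector $\hat{\mathbf{u}}_i$ of $\mathbf{u}_i = 2\mathbf{r}_i^{\alpha} - \mathbf{r}_{i-1}^{\alpha} - \mathbf{r}_{i+1}^{\alpha}$. The shorthand $\mathbf{u}_i$ is introduced here only for illustration.

```latex
\delta\mathbf{r}_i^{\beta} \simeq \delta\mathbf{r}_i^{\alpha}
  + \frac{3}{|\mathbf{u}_i|}\left[\delta\mathbf{u}_i
  - \hat{\mathbf{u}}_i\left(\hat{\mathbf{u}}_i \cdot \delta\mathbf{u}_i\right)\right],
\qquad
\delta\mathbf{u}_i = 2\,\delta\mathbf{r}_i^{\alpha}
  - \delta\mathbf{r}_{i-1}^{\alpha} - \delta\mathbf{r}_{i+1}^{\alpha}
```

The bracket removes the component of $\delta\mathbf{u}_i$ along $\hat{\mathbf{u}}_i$, since to first order a unit vector can only rotate; if the change in $|\mathbf{u}_i|$ is neglected, the bracket reduces to $\delta\mathbf{u}_i$ alone. In either form the Cβ displacement is a linear combination of the displacements of the Cα atoms of residues i-1, i and i+1, which is why the centroids add no degrees of freedom.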

The Hamiltonian of the system depends quadratically on the deviations of the Cα and Cβ
from their native positions, assumed to be the energy minimum in the configuration space
(thus neglecting crystal effects on X-ray structures): the displacements of the protein's atoms
from the equilibrium position are supposed to be small enough to justify this approximation.



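The Hamiltonian includes interactions between α and β carbons lying within a cut-off
distance $r_c$, above which no pairwise interaction is allowed, as well as an effective interaction
accounting for the strength of the peptide bond for nearest-neighbouring Cα atoms:

$$\mathcal{H} = \mathcal{H}_{peptide} + \mathcal{H}_{\alpha\alpha} + \mathcal{H}_{\alpha\beta} + \mathcal{H}_{\beta\beta} \qquad (3)$$

where

$$\mathcal{H}_{xy} = \gamma_{xy}\left(1 - \frac{\delta_{xy}}{2}\right)\sum_{i,j}\sum_{\mu,\nu} \delta r_{i,\mu}^{x}\, M_{ij}^{\mu\nu}(x,y)\, \delta r_{j,\nu}^{y} \qquad (4)$$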

$\gamma_p$ is the elastic constant accounting for the relative strength of the effective peptidic interaction
between nearest-neighbouring α carbons;

$\gamma_{xy}$ is the elastic constant for the contact interaction between carbon atoms of type x and
y ($x, y \in \{\alpha, \beta\}$);

$\delta_{xy}$ is the Kronecker delta, which avoids double counting of the interactions between atoms of the
same type;

$\delta r_{i,\mu}^{x}$ is the displacement from the native position of the carbon atom of type x that belongs
to the ith residue (μ and ν are the indexes of the Cartesian components);

$M_{ij}(x,y)$ ($i \neq j$) is a (3 × 3) matrix, the off-diagonal super-element of the Hessian matrix
for the interaction between atoms of type x and y which belong to residues i and j:

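$$M_{ij}^{\mu\nu}(x,y) = -\,\Gamma_{ij}^{xy}\,\frac{r_{ij,\mu}^{xy}\, r_{ij,\nu}^{xy}}{\left|\mathbf{r}_{ij}^{xy}\right|^{2}} \qquad (5)$$

where $\Gamma_{ij}^{xy}$ ($i \neq j$) is equal to 1 if the native separation of the corresponding atoms lies
below the cut-off radius $r_c$, 0 otherwise; $\mathbf{r}_{ij}^{xy} = \mathbf{r}_i^{x} - \mathbf{r}_j^{y}$ is the vector of native separation
of atoms of type x and y that belong to residues i and j respectively. Entries of
diagonal super-elements are built according to the relation: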
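The contact indicator entering the off-diagonal super-elements can be sketched in a few lines. This is a minimal illustration, assuming native coordinates are given as (x, y, z) tuples in Angstroms; the function name `contact_matrix` and the toy chain are hypothetical and not part of the original work.

```python
import math

def contact_matrix(coords_x, coords_y, r_c=7.0):
    """Return G[i][j] = 1 if atom i (type x) and atom j (type y), with i != j,
    are closer than r_c in the native structure, and 0 otherwise."""
    gamma = [[0] * len(coords_y) for _ in coords_x]
    for i, a in enumerate(coords_x):
        for j, b in enumerate(coords_y):
            if i != j and math.dist(a, b) < r_c:
                gamma[i][j] = 1
    return gamma

# Toy example (illustrative only): three alpha carbons on a line, 3.8 A apart.
ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
gamma_aa = contact_matrix(ca, ca)
```

With the 7.0 Å cut-off only consecutive atoms of the toy chain are in contact, since the first and third atoms are 7.6 Å apart.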

$$M_{ii}^{\mu\nu}(x,y) = -\sum_{j \neq i} M_{ij}^{\mu\nu}(x,y) \qquad (6)$$

Since the position of the effective Cβ and its displacement from equilibrium are fully
determined by the α carbon coordinates (equations (1) and (2)), by substitution of (1) and
(2) in (3) one is left with an effective Hamiltonian $\mathcal{H}$ which depends quadratically on Cα
displacements from the native state (the index of atom type will be therefore dropped in
the following equations for simplicity):

$$\mathcal{H} = \frac{\gamma}{2}\sum_{i,j}\sum_{\mu,\nu} \delta r_{i,\mu}\, M_{ij}^{\mu\nu}\, \delta r_{j,\nu} \qquad (7)$$

where $\gamma_p$ and $\gamma_{xy}$ ($x, y \in \{\alpha, \beta\}$) have been incorporated in $M_{ij}$, expressed in units of the
reference elastic constant γ.
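To make the structure of such a matrix concrete, the following sketch assembles a Hessian in units of the reference elastic constant for a simplified network of α carbons only. It is not the full βGM: the β centroids are omitted, and it is assumed here (an illustrative choice) that consecutive residues interact through the peptide term alone while all other pairs within the cut-off interact with unit strength. The function name and toy chain are hypothetical.

```python
def build_hessian(coords, r_c=7.0, gamma_p=2.0, gamma_c=1.0):
    """Return the 3N x 3N Hessian (nested lists) of a C-alpha-only elastic network.

    Off-diagonal 3x3 blocks are -k * r_mu * r_nu / |r|^2 and each diagonal
    block is minus the sum of the off-diagonal blocks in its row.
    """
    n = len(coords)
    hess = [[0.0] * (3 * n) for _ in range(3 * n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = [coords[i][k] - coords[j][k] for k in range(3)]
            dist2 = sum(c * c for c in d)
            if j == i + 1:
                k_ij = gamma_p        # assumption: peptide term only
            elif dist2 < r_c * r_c:
                k_ij = gamma_c        # contact term within the cut-off
            else:
                continue
            for mu in range(3):
                for nu in range(3):
                    block = k_ij * d[mu] * d[nu] / dist2
                    hess[3 * i + mu][3 * j + nu] -= block
                    hess[3 * j + nu][3 * i + mu] -= block
                    hess[3 * i + mu][3 * i + nu] += block
                    hess[3 * j + mu][3 * j + nu] += block
    return hess

# Toy example (illustrative only): three atoms on the x axis, 3.8 A apart.
ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
hess = build_hessian(ca)
```

Because each diagonal block is minus the sum of the off-diagonal blocks in its row, rigid translations cost no energy and appear as zero eigenvalues, which are the modes excluded from the sums over non-zero eigenvalues.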

Time dependent two-point correlation functions can be calculated within a Langevin
dynamics leading to equilibrium with the Boltzmann factor [3]. In the overdamped
regime, with the viscous damping factor f the same for all residues and white noise
$\eta_{i,\mu}(t)$, the Langevin equation for our system is:

$$f\,\frac{d}{dt}\,\delta r_{i,\mu}(t) + \gamma \sum_{j,\nu} M_{ij}^{\mu\nu}\, \delta r_{j,\nu}(t) = \eta_{i,\mu}(t) \qquad (8)$$



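One can easily get from equation (8) the time dependence of cross correlations between
pairs of Cα atoms (the so-called "reduced" cross-correlations, expressed in units of $k_B T/\gamma$):

$$\langle \delta\mathbf{r}_i(t) \cdot \delta\mathbf{r}_j(0) \rangle = \sum_k \frac{1}{\lambda_k}\,(\mathbf{a}_{ik} \cdot \mathbf{a}_{jk})\, e^{-\lambda_k t/\tau} \qquad (9)$$

where $\tau = f/\gamma$ is the reference relaxation time, corresponding to an overdamped spring of elastic
constant γ in a dissipative medium of friction f; $\lambda_k$ are the non-zero eigenvalues of M and $\mathbf{a}_k$
the corresponding eigenvectors ($\mathbf{a}_{ik}$ collects the three components of $\mathbf{a}_k$ on residue i).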
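The mode sum for the cross-correlations can be evaluated directly once eigenvalues and eigenvectors are available. The sketch below assumes the eigenpairs have already been computed by some linear algebra routine and are passed in as plain lists; correlations are returned in units of k_B T/γ and time in units of the reference relaxation time. The function name and the hand-made eigenpairs are hypothetical.

```python
import math

def cross_correlation(eigvals, eigvecs, i, j, t_over_tau=0.0, tol=1e-9):
    """Reduced <dr_i(t) . dr_j(0)> as a sum over modes with non-zero eigenvalue.

    eigvecs[k] is a flat list of 3N components ordered as
    (x_0, y_0, z_0, x_1, y_1, z_1, ...).
    """
    total = 0.0
    for lam, vec in zip(eigvals, eigvecs):
        if lam < tol:
            continue  # skip zero modes (rigid-body motions)
        dot = sum(vec[3 * i + mu] * vec[3 * j + mu] for mu in range(3))
        total += dot / lam * math.exp(-lam * t_over_tau)
    return total

# Toy example (illustrative only): two residues, one zero mode and one
# stretching mode with eigenvalue 2 along x.
s = 1.0 / math.sqrt(2.0)
eigvals = [0.0, 2.0]
eigvecs = [[s, 0.0, 0.0, s, 0.0, 0.0], [s, 0.0, 0.0, -s, 0.0, 0.0]]
c01 = cross_correlation(eigvals, eigvecs, 0, 1)
```

In the toy case the two residues move in opposite directions along the only internal mode, so their equal-time correlation is negative; each mode then decays with its own relaxation time, the reference time divided by its eigenvalue.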

Theoretical B-factors (measured in Å²) are obtained from the diagonal elements of the
reduced covariance matrix (i.e. from the mean square fluctuations of Cα atoms around native-state
equilibrium, after thermal equilibrium has been reached), through the relation:

$$B_i = \frac{8\pi^2 k_B T}{3\gamma}\,\langle \delta\mathbf{r}_i \cdot \delta\mathbf{r}_i \rangle \qquad (10)$$

Equation (10) will be used to fit the experimental B-factors and get an estimate of the
elastic constant γ.



A. Tuning model parameters 

In order to obtain reliable data for the structures under study, we compare theoretical and 
experimental results using the ranking correlation between the two data sets as a guideline 
to tune model parameters to their optimal values. 

ANM was applied to the structures as well as βGM: in the case of the trHbs, the theoretical
temperature factors obtained with the βGM showed a higher value of Kendall's
non-parametric τ [39] (see below) against the experimental ones (τ = 0.45 for ANM with
$r_c$ = 13.0 Å, τ = 0.57 for βGM with $r_c$ = 7.0 Å, in the case of 1DLW).

ANM works very well for bigger complexes, while for smaller proteins more details are
required: the reason for the better agreement obtained by the βGM is to be found in the
presence of the β centroids, which considerably increases the number of pairwise interactions
and takes into account the directionality of the side chains in the contact map of α carbons.

As a consequence, the βGM needs a lower and more realistic cut-off radius $r_c$ to reproduce
experimental B-factors and molecular dynamics data, in comparison to those used by
ANM, as already remarked in a previous study, even with small proteins like trHbs:
hence the choice to use the βGM in the present work.

Here in particular, the best agreement between theory and experiment was found using
a cut-off of 7.0 Å. This choice is imposed by the difference in compactness between helical
regions and coils, and it is critical in order to keep the density of effective contact interactions
at the coarse-grained level comparable to the all-atom one.

Larger cut-offs cause contact density to be overestimated for the helical regions, leading 
to smaller values of B-factors, with respect to the experiment: the consequence is a marked 
difference between flexible and solvent exposed parts of the protein, compared to the less 
flexible and buried parts (fig. 2).

A key point was the tuning of $\gamma_p$, the ratio between the effective peptide bond and the αα
interaction: it accounts for the relative stiffness of the covalent bonds along the backbone
as opposed to the weaker contact interactions between Cα pairs.

Summarizing the values for the parameters used in the calculation for both structures:
$r_c$ = 7.0 Å, $\gamma_p$ = 2.0, $\gamma_{\alpha\alpha} = \gamma_{\alpha\beta} = \gamma_{\beta\beta}$ = 1.0 (the last ones are in units of γ). The value
of the elastic constant γ will be determined later, fitting the results of the model with the
available experimental data.



B. Temperature factors and heme modeling 

Truncated hemoglobins are heme proteins, the heme group being the active site of the
molecule: there oxygen and carbon monoxide bind to the sixth coordination position of the iron
atom, which lies at the center of the tetrapyrrole ring and is bound to the imidazole ring
of the proximal histidine F8 at the fifth coordination site (His 68; it is the eighth residue of helix F
in sperm whale myoglobin and in vertebrate hemoglobins, which is where the nomenclature "F8" comes
from).

Figure 2 shows the plot of the α carbon atoms B-factors of the X-ray structure of truncated
hemoglobin in Paramecium caudatum and their corresponding mean square displacements
derived from the βGM: the most mobile regions are the loops and turns between helices, while the helices
display smaller fluctuations, in agreement with the results of an NMA study
performed on deoxymyoglobin (Mb) [41, 42].



FIG. 2: Theoretical (black) versus experimental (gray) X-ray B-factors for α carbons in PtrHb,
related through equation (10). Theoretical B-factors including the coarse-grained heme group are shown
for comparison (dashed). Helical segments have been marked on the residue axis. The inset shows
theoretical versus experimental B-factors for the coarse-grained heme, with pdb names of the iron and
carbons included in the coarse-graining.

The significance of the correlation between experimental and theoretical values is deduced
from Kendall's non-parametric τ [39]. Since one does not know a priori the probability
distribution of the experimental B-factors, a significance for the agreement between the two
data sets cannot be computed from the value of the linear correlation coefficient. On the
other hand, the rank correlation given by τ is independent of the distribution. Kendall's τ
for PtrHb is 0.56 (0.52 with heme), for CtrHb is 0.37 (0.40 with heme), and $P_{null}(\tau) < 10^{-9}$
in all cases ($P_{null}(\tau)$ is the probability for two random sets of data to have a value of τ bigger
than the one found between B-factors predicted by the model and calculated from the X-ray
structure. The number of residues is 116 for PtrHb and 121 for CtrHb).
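For reference, a rank correlation of this kind can be computed with a short routine. The sketch below is one common tie-aware form of Kendall's τ, given here as an illustration and not necessarily the exact routine used for the values quoted above; the function name and the toy profiles are hypothetical.

```python
import math

def kendall_tau(x, y):
    """Kendall rank correlation between two equally long sequences.

    Counts concordant and discordant pairs and normalises by the numbers of
    pairs that are not tied in x and not tied in y, respectively.
    """
    concordant = discordant = 0
    untied_x = untied_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx != 0:
                untied_x += 1
            if dy != 0:
                untied_y += 1
            if dx * dy > 0:
                concordant += 1
            elif dx * dy < 0:
                discordant += 1
    return (concordant - discordant) / math.sqrt(untied_x * untied_y)

# Toy example (illustrative only): two short profiles with one swapped pair.
tau = kendall_tau([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0])
```

Only the ordering of the values enters, so the result does not depend on the elastic constant that rescales the theoretical B-factors. The routine is undefined when one of the two sequences is constant.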

The coarse-graining of the heme group includes the iron atom and nine carbons of the
porphyrin ring (whose names are reported on the x axis of the inset in fig. 2), chosen in
order to keep the number of contacts in the modeled system comparable to the number of
heme native contacts with nearby residues, thus avoiding a loosely connected group
as an artifact of the coarse-graining procedure.

Insertion of heme brings only one relevant change to the temperature factors plot (fig. 2):
helix F has its displacements from equilibrium considerably damped, as expected, being
bound to the iron atom. A reduction in the fluctuations is shown also by the loops between
helices C and F, to a lesser extent than in helix F. The protein part of the reduced covariance
matrix obtained including the coarse-grained heme was compared with the covariance
matrix computed without modeling the tetrapyrrole ring. The two show a Kendall's non-parametric
correlation τ ≈ 0.81 over more than thirteen thousand points, which stands for
a remarkable agreement between them: the coarse-grained heme in fact anticorrelates with
the same parts of the protein as helix F, even if more weakly (data not shown). This is not
surprising, since the iron atom and the proximal histidine F8 are in direct contact, so the
motion of the heme group will be strongly correlated with that of the F helix, following
the proximal side in its deviations from native-state equilibrium: the inclusion of a few more
atoms under the coarse-grained scheme adopted here does not seem to significantly modify
the correlations. The mechanical response of the protein upon binding of ligands to the
iron atom is given by the properties of the network of backbone atoms: thus a good agreement
with the known behavior of globins may be achieved using Gaussian models even without
considering heme groups in the coarse-graining procedure [43].

The βGM heme B-factors plot is in substantial agreement with the experimental B-factors
for heme (fig. 2, inset). In fact the heme pocket is entirely surrounded by non-polar
residues: one of the main purposes of the distal region is to screen the heme group from
solvent interaction, in order to avoid iron oxidation [40].

The results for Kendall's ranking lie in the typical range of Gaussian models, even
with terminal residues included, and the correlation is without doubt statistically
significant: still there are some regions of the protein whose fluctuations are not well
reproduced by the model, as shown in the plot of temperature factors (fig. 2).

The model overestimates interactions between α and β carbons belonging to secondary
structures, resulting in local deviations from the density of the all-atom picture. Hence
displacements of residues belonging to helical regions are underestimated, since these are
the most compact parts of the protein, and this produces deviations in the profile of B-factors,
whose values depend both on the assembly of secondary motifs [44] and on the local packing
density.

Furthermore, electrostatics and solvent exposure for different residues are not taken into
account by the simple approach of the model: electrostatic interactions localized on helices
may modify the magnitude of the driving forces, producing larger displacements from the native
state than expected.



The POPS program (Parameter OPtimized Surface [46]) has been used to calculate the solvent
accessible surface area per residue for PtrHb: the most exposed residues are the
ones displaying the greatest average displacements from the native structure, as
expected (fig. 2). These small residues (Gly 35, the GGP region - Gly54, Gly55, Pro56 -
Thr60, Gly61), located in loops CD, EF and in the pre-F region, allow larger flexibility to the
polypeptide chain (glycines especially), and the bigger fluctuations predicted by the model
are due to their diminished connectivity as well, being the most exposed to the solvent. This
was expected, since the model totally neglects solvent exposure.

The simplified approach used here shows a remarkably better agreement with experiment
for buried regions, where the connectivity of atoms is greater and the solvent plays a minor
role.



IV. RESULTS AND DISCUSSION 

In order to identify the relevant motions of the protein, the reduced covariance matrix plot
(figure 3) of PtrHb is inspected (PtrHb will be the main target of the following discussion,
the same considerations holding for CtrHb as well), normalized as follows:

$$c_{ij} = \frac{\langle \delta\mathbf{r}_i \cdot \delta\mathbf{r}_j \rangle}{\sqrt{\langle \delta\mathbf{r}_i \cdot \delta\mathbf{r}_i \rangle\,\langle \delta\mathbf{r}_j \cdot \delta\mathbf{r}_j \rangle}} \qquad (11)$$

Normalization is generally performed in order to allow a direct comparison between the
cross-correlations predicted by the model and the ones obtained in computer simulations,
e.g. from molecular dynamics, provided equilibration has been reached [47].
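The normalization amounts to dividing each covariance entry by the square root of the product of the two corresponding diagonal entries. A minimal sketch, assuming the covariance is available as an N x N nested list with strictly positive diagonal (the function name and the toy matrix are hypothetical):

```python
import math

def normalize_covariance(cov):
    """Return c[i][j] = cov[i][j] / sqrt(cov[i][i] * cov[j][j])."""
    n = len(cov)
    return [
        [cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(n)]
        for i in range(n)
    ]

# Toy example (illustrative only): two residues with anti-correlated motion.
cov = [[4.0, -2.0], [-2.0, 9.0]]
c = normalize_covariance(cov)
```

After normalization the diagonal is identically 1 and every entry lies between -1 and 1, so any overall scale factor of the covariance (such as the elastic constant) drops out.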

From the reduced normalized covariance matrix one is able to extract non-trivial information
on the collective motions of the protein under study: these generally involve the
regions of the molecule that show negative correlations.

Indeed it turns out that spatially close parts of the molecule, i.e. residues in contact,
undergo motions with positive correlation, as one would expect for contact-driven motions.

FIG. 3: Normalized covariance matrix: trivial correlations due to contacts have been put to 0
(green); diagonal elements are equal to 1 (red); anti-correlations range from 0 to the minimum
value found, for Gln41 (E7) and His68 (F8), lower than -0.35 (blue). Helical regions have been
highlighted.

One can identify three main blocks in the covariance plot (fig. 3): the first one contains
helices A, B, C, E, the loops between them and the EF loop; the second one clearly includes
the pre-F loop, heme bound helix F, as well as the first part of helix G, while the third block
hosts the major part of helix G and the C-terminal side of helix H.

Most residues in the first block, especially the ones belonging to helical regions A, B and
E (distal side), show a remarkable anti-correlation with residues localized at the beginning
of the second block, belonging to the proximal helix F and to helix G; in the third block the
last turn of helix H is bent at the C-terminal to allow closer contacts with heme [21].

This division in domains of motion is similar to the one found for deoxymyoglobin [41],
provided that one notes the effect of the bending of the C-terminal side in helix H, which
implies a correlated motion with the proximal side, as suggested by fig. 4, where normalized
correlations between His68 (F8) and the rest of the protein are shown. Here the crucial role
of the small helix F in the dynamics of the protein is underlined, since it contains the proximal
histidine, and the division of the protein in domains of motion as described above is made
more evident.

FIG. 4: Normalized cross correlations between the proximal histidine F8 and the rest of the protein
($c_{68,j}$, hence the peak rising to 1.0 at j = 68): residues displaying significant anti-correlations with
His68 are labeled in the plot. Like Phe19 (B9), they are strongly conserved throughout the trHb
family, being relevant to prevent solvent access to the heme pocket (E14, EF7 [21]), to stabilize
the heme-bound ligand, or to gate between the heme pocket and an apolar cavity
running inside the protein matrix (G12, H11 [21]).

The covariance minima in the plot of figure 4 are particularly meaningful, being found
between His F8 and other key residues of the protein. Phe19 (B9), which has a bulky side
chain, is responsible for the screening of the distal cavity from the aqueous environment
outside the molecule, and is strongly conserved among trHbs; in a position occupied by the
distal histidine in vertebrate Hbs we find Gln41 (E7), hydrogen bonded to Tyr20 (B10);
together they contribute to stabilize the heme-bound ligand [21] and form a hydrogen bonding
network in the heme pocket, which is believed to be responsible for the different ligand
rebinding kinetics displayed by PtrHb and CtrHb in comparison with Mbs and Hbs [48].

His68, taken here as a representative to deduce the motion of the whole proximal side
from the covariance and correlation plots (figs. 3 and 4), anti-correlates with the hydrophobic Phe33
(CD1) as well, another strongly conserved residue among trHbs: together with the previous
three residues they line precisely the distal cavity facing the heme group. The anti-correlation
of the distal and proximal sides is a clear sign of the concerted motion which may allow the
heme pocket to expand, thus making access to heme easier for ligands coming from the
apolar cavity that links the inner part of the protein to the solvent [21], escaping the steric
hindrance of the distal side residues.

Strong anti-correlations with the proximal histidine are displayed by Leu49 (E15), Leu85
(G12) and Ala/Val106 (H11) as well: these residues lie at the bottom of the distal cavity, at
the interface between the tunnel running inside the protein matrix and the heme pocket.

FIG. 5: Components of normalized eigenvectors for the two slowest modes of motion (1,
top; 2, bottom; ratio of corresponding eigenvalues: 1.16), which bring a similar contribution to the
dominant opening mechanism of the distal cavity, driven by the anticorrelated motions of the
proximal (pre-F loop, helix F, loop FG and last part of helix H) and distal sides (helix C, CD
loop and helix E especially). Residues with bulky side chains, strongly conserved in the family of
trHbs and belonging to the hydrophobic cluster preventing solvent access to the heme pocket,
are spatially located near the residues with the biggest components, highlighted in the plot: Phe33
(CD1), Trp59 (EF7), Phe48 (E14). The latter acts as gating residue in trHbN from Mycobacterium
tuberculosis [49].

The anticorrelated motion of the proximal and distal sides is made more visible by inspection
of the components of the eigenvectors corresponding to the two slowest overdamped
modes, plotted in figure 5.
Residues displaying the biggest deviations from their native positions are highlighted:
they belong to loops between helices lining the heme pocket (CD and EF loops, pre-F
region), and to helices enclosing the distal and proximal sides (helices B and E, helices F and
H). These modes contribute substantially to the opening and closing of the distal side, in
agreement with previous studies on globins [42].




FIG. 6: Open (left) and closed (right) conformations of the distal cavity, obtained by adding or
subtracting the rescaled eigenvector of the slowest mode to the native positions of α carbons
(scaling factor: 20). The most mobile regions in the first mode are coloured in red (loops) and purple
(helices). Heme group and native structure are drawn in gray, as well as the heme bound ligand
stabilizing residues Tyr B10 and Gln E7 and the hinges of the distal side opening mechanism, Pro
EF3 and Gln G4 (figure drawn using VMD [50] and Raster3d [33]).



A detailed view of the conformations visited by the first mode is shown in figure 6, where
the open and closed structures of the distal cavity are displayed, along with distal residues
Tyr B10 and Gln E7.
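The displaced conformations are obtained by a simple shift of the native coordinates along the mode. A minimal sketch, assuming the eigenvector is a flat list of 3N components ordered residue by residue (the function name and the toy data are hypothetical):

```python
def displace_along_mode(native, mode, scale=20.0):
    """Return the two conformations native + scale*mode and native - scale*mode.

    native is a list of (x, y, z) positions; mode is a flat list of 3N
    components ordered as (x_0, y_0, z_0, x_1, y_1, z_1, ...).
    """
    plus = [[r[k] + scale * mode[3 * i + k] for k in range(3)]
            for i, r in enumerate(native)]
    minus = [[r[k] - scale * mode[3 * i + k] for k in range(3)]
             for i, r in enumerate(native)]
    return plus, minus

# Toy example (illustrative only): two atoms moving in opposite directions along x.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
mode = [0.1, 0.0, 0.0, -0.1, 0.0, 0.0]
plus, minus = displace_along_mode(native, mode)
```

Since the overall sign of an eigenvector is arbitrary, which of the two returned conformations corresponds to the open cavity has to be decided by inspecting the structures.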

From the covariance plot (fig. 3) and the component along the y axis of the second slowest
eigenvector of figure 5 (although small, due to the normalization, which enhances the most
mobile regions like loops) one can notice the anticorrelation of the proximal histidine with
the residues identified to line the passage leading to the heme pocket from the tunnel inside
the protein (mainly Phe19, Leu85 and Ala106, already evidenced in fig. 4) [21, 28].
The anti-correlation between the two groups of residues hints at a possible mechanism for the
passage of ligands to the heme pocket, through the enlargement of the gate: the presence of
the apolar cavity has been proposed to contribute effectively to the fast rebinding of ligands
on heme, together with the hydrogen bonding network in the distal side, as already pointed
out [28, 51].




FIG. 7: Schematic representation of near equilibrium motions of groups of residues delimiting the
heme pocket, inferred from covariance analysis: solvent accessible surfaces of residues delimiting
the apolar cavity (res. 6, 12, 16, 17, 49, 53, 85, 89, lower left), and the distal cavity (proximal side:
res. 64, 68, 71, upper right; distal side: res. 19, 20, 32, 33, 41, upper left) are shown with a 1.4 Å
radius probe. Ball-and-stick representation is used for His68 and the residues labeled in figure 4.
The cluster in the lower right (res. 48, 51, 52, 59, 105, 109) defines a narrower cavity [51]. Figure
drawn with VMD [50], rendered with Raster3d [33].



The combined motion of the main blocks is compatible with a pumping mechanism: 
according to the results obtained in this study, the native state conformation of the two 
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truncated hemoglobins is such that small displacements of the atoms, due to stochastic 
interactions with the solvent, produce an anti-correlated motion of the proximal and distal
sides, which line the heme pocket, bringing atoms back to equilibrium positions. These
movements may facilitate the diffusion of small ligands such as O2 and CO to the heme through
the protein tunnel, exploiting its volume variations.



A. Elasticity and time scale 

An estimate of the elastic constant γ of the model can be computed by fitting the experimental
temperature factors of the X-ray structures with the theoretical ones, obtained from the
mean square displacements of the Cα atoms, according to equation (JTUJ). Following the method used
in Q| to fit the data (i.e. by matching the areas enclosed by the two data sets)
and averaging the values found for the two proteins yields γ = 0.20 N m⁻¹, with a tolerance of
0.05 N m⁻¹ between averaged values (the introduction of heme in the network of interactions
leads to a decrease of the value of the elastic constant, since it enhances the local connectivity
of the buried residues in the heme pocket). The order of magnitude obtained for γ agrees
with estimated values for the elastic constant of single parameter models 00,0,0.
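
The area-matching fit can be illustrated with a short sketch. It is not the procedure of the paper: it assumes a plain single-parameter Gaussian network on one node per residue (no β centroids, no heme), the standard relation B_i = 8π² k_B T [Γ⁻¹]_ii / γ between temperature factors and the inverse connectivity matrix Γ, and hypothetical helper names, cutoff and coordinates. The synthetic B-factors are generated from a known γ only to check that the fit recovers it.

```python
import numpy as np

K_B = 1.380649e-23  # Boltzmann constant, J/K


def kirchhoff(coords, cutoff):
    """Connectivity (Kirchhoff) matrix for nodes closer than `cutoff` (in Å)."""
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    contact = (dist < cutoff) & (dist > 0.0)
    kirch = -contact.astype(float)
    np.fill_diagonal(kirch, contact.sum(axis=1))
    return kirch


def inverse_diagonal(kirch, tol=1e-8):
    """Diagonal of the pseudo-inverse, skipping the zero (rigid-body) mode."""
    evals, evecs = np.linalg.eigh(kirch)
    keep = evals > tol
    return (evecs[:, keep] ** 2 / evals[keep]).sum(axis=1)


def fit_gamma(kirch, b_exp, temperature=300.0):
    """Elastic constant (N/m) equalizing the areas under the two B-factor profiles."""
    area_theory = 8.0 * np.pi ** 2 * K_B * temperature * inverse_diagonal(kirch).sum()
    gamma_j_per_a2 = area_theory / np.sum(b_exp)   # J / Å^2 (B-factors in Å^2)
    return gamma_j_per_a2 * 1e20                   # 1 J/Å^2 = 1e20 N/m


# Toy check: an ideal helix of 20 nodes, with synthetic B-factors (Å^2)
# generated from gamma = 0.20 N/m = 0.20e-20 J/Å^2.
t = np.arange(20)
coords = np.column_stack([2.3 * np.cos(1.745 * t), 2.3 * np.sin(1.745 * t), 1.5 * t])
kirch = kirchhoff(coords, cutoff=7.0)
b_synthetic = 8.0 * np.pi ** 2 * K_B * 300.0 / 0.20e-20 * inverse_diagonal(kirch)
gamma_fit = fit_gamma(kirch, b_synthetic)
```

Matching the areas amounts to equating the sums of the two profiles over residues, so γ follows from a single ratio rather than from a least-squares minimization.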

The importance of friction due to the solvent in determining the rates of functional
motions of proteins, as it slows down the relaxation of large-scale displacements
predicted by normal mode analysis, has recently been underlined [5J|: in the framework of the
Langevin dynamics introduced with equation (jHJ), we estimate the order of magnitude of the
reference decay time τ of the first two modes of motion previously described, through an
effective value for the friction coefficient f, chosen to be the same for all residues for simplicity.
A lower limit for f is the value computed from an all-atom simulation in 5^|, whereas here
whole residues are considered (although the effective radii associated with such an estimate
are bigger than the Van der Waals radii of the atoms in the simulation, hinting at a collective
character of the simulated displacements, the motions predicted by the slowest modes
involve many more residues in distant parts of the protein and a larger value for the friction
may be expected). As an upper limit, the friction relative to the whole proteins (both PtrHb
and CtrHb roughly fit a cubic box of side 3.5 nm) moving in water at physiological conditions
is calculated from Stokes' law (see j^|, chapter 3). We obtained f ~ 4 to 70 pN m⁻¹ s
(similar ranges for the values of friction coefficients have been extracted from molecular
dynamics simulations [56]).

The corresponding reference relaxation time τ in the Langevin dynamics of equations |H1
and El lies within the range 0.02 to 0.35 ns, while the relaxation time associated with
eigenmode i will be τ_i = τ λ_i⁻¹ (where λ_i is the eigenvalue relative to that eigenmode): the two slowest
eigenmodes display relaxation times for the related motions approximately within the range
0.2 to 3.5 ns.
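
The arithmetic behind these ranges can be checked with a short sketch. It assumes that the reference time is the ratio of friction to elastic constant, τ = f/γ, which reproduces the quoted range from the values of f and γ given above. The eigenvalue 0.1 is hypothetical: it is chosen only because it is the order of magnitude that maps the range of τ onto the range quoted for the two slowest modes. Function and constant names are illustrative.

```python
GAMMA = 0.20                        # elastic constant from the fit, N/m
FRICTION_RANGE = (4e-12, 70e-12)    # 4 and 70 pN s/m, expressed in N s/m
NS = 1e-9                           # one nanosecond, in seconds


def reference_time(friction, gamma=GAMMA):
    """Reference decay time tau = f / gamma, in seconds (assumed form)."""
    return friction / gamma


def mode_time(friction, eigenvalue, gamma=GAMMA):
    """Relaxation time of one overdamped mode, tau_i = tau / lambda_i."""
    return reference_time(friction, gamma) / eigenvalue


# Reference time at the lower and upper friction limits, in ns.
tau_ns = [reference_time(f) / NS for f in FRICTION_RANGE]

# Hypothetical eigenvalue of order 0.1 for the slowest modes (see lead-in).
slow_ns = [mode_time(f, eigenvalue=0.1) / NS for f in FRICTION_RANGE]

print(tau_ns, slow_ns)
```

Since τ_i scales as 1/λ_i, the slowest modes (smallest eigenvalues) relax most slowly, while the spread of each range comes entirely from the uncertainty on the friction coefficient.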

This range of time scale is compatible with CO rebinding kinetics of Mbs and Hbs, while 
PtrHb and CtrHb behave quite differently 4||: the explanation proposed for the different 
behaviour relies on the hydrogen-bonding network formed in the distal cavity of these trHbs, 
which is absent in invertebrate globins and is beyond the possibility of the simple model 
used here, which underlines instead the common characteristics of globins and trHbs. 

V. CONCLUSIONS 

It has been shown how a simple coarse-grained approach can bring insights into the
functional motions of two small proteins of the truncated hemoglobins family, PtrHb and
CtrHb, through the near-equilibrium vibrational properties of the structures modeled as a
Gaussian network of interacting α carbons and β centroids.

The key point in the analysis performed here is the information extracted from the
covariance matrix in its reduced form and from the two slowest modes of fluctuation: negative
correlations between residues set far apart in the three-dimensional structure are particularly
useful, being nontrivial and hinting at the collective character of the motions.
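
How negative correlations between distant sites emerge from a network model can be seen in a toy case. The sketch below is an illustration under simplifying assumptions, not the reduced covariance matrix of this work: it uses a plain single-parameter Gaussian network, where the covariance is proportional to the pseudo-inverse of the connectivity matrix Γ and the normalized form is C_ij = [Γ⁻¹]_ij / ([Γ⁻¹]_ii [Γ⁻¹]_jj)^(1/2). The five-node chain and the function name are hypothetical.

```python
import numpy as np


def normalized_covariance(kirch, tol=1e-8):
    """Normalized covariance from the pseudo-inverse of a connectivity matrix."""
    evals, evecs = np.linalg.eigh(kirch)
    keep = evals > tol                               # drop the zero mode
    cov = (evecs[:, keep] / evals[keep]) @ evecs[:, keep].T
    sd = np.sqrt(np.diag(cov))
    return cov / np.outer(sd, sd)


# Toy network: a linear chain of five nodes with nearest-neighbour springs.
n = 5
kirch = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
kirch[0, 0] = kirch[-1, -1] = 1.0                    # end nodes have one neighbour
corr = normalized_covariance(kirch)

print(np.round(corr, 3))
```

In this chain the neighbouring nodes are positively correlated, while the two ends, which share no direct contact, are anti-correlated: the sign is a property of the collective modes and not of any single interaction.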

This information has been used in the present work to confirm, within such a simplified
approach, the mechanism which is believed to facilitate the diffusion of small ligands to the heme
pocket and the iron atom. The cavity delimited by several key hydrophobic residues,
providing a path from the surface of the protein to the heme pocket 21, 24, 2sl l ^.., is able to
enlarge its volume, allowing the passage of small molecules to the distal side, as is
inferred from the anti-correlations between the displacements of the opposite sides of the
heme pocket.

Excitations, due to interactions between the molecule and the solvent, produce deviations
from equilibrium followed by a decay towards the native state. The collective behaviour of
the return to equilibrium, produced by a superposition of overdamped motions, allows
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the volume of the inner cavities to vary accordingly. 

Through a fit of the mean square displacements of α carbons from their minimum energy
configuration with the experimental temperature factors for the two structures under study,
a rough estimate of the order of magnitude of the time scale for functionally relevant motions
has been given, in reasonable agreement with known properties of globular proteins.

This suggests the validity of the simple Gaussian approach as a means to get a fast picture
of the near-native functional motions of globular proteins, one that still agrees with the results
obtained using more accurate and computationally demanding tools.

The description given by the simple model used here does not provide atomic details,
keeping the analysis at a coarse-grained level. Still, the use of the effective β centroid for
each residue, along with the Cα, helps in characterizing more realistically the
displacements of the residue side chains, thus getting a closer agreement with more detailed
approaches.
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